Improving Lazy Attribute Selection
Authors
Abstract
Attribute selection is a data preprocessing step which aims at identifying relevant attributes for a target data mining task – specifically in this article, the classification task. Previously, we have proposed a new attribute selection strategy – based on a lazy learning approach – which postpones the identification of relevant attributes until an instance is submitted for classification. Experimental results showed the effectiveness of the technique, as in most cases it improved classification accuracy when compared with the analogous eager attribute selection approach performed as a data preprocessing step. However, in the previously proposed approach, the performance of the classifier depends on the number of attributes selected, which is a user-defined parameter. In practice, it may be difficult to select a proper value for this parameter, that is, the value that produces the best performance for the classification task. In this article, aiming to overcome this drawback, we propose two approaches to be used in conjunction with the lazy attribute selection technique: one that tries to identify, in a wrapper-based manner, the appropriate number of attributes to be selected, and another that combines, in a voting approach, different numbers of attributes. Experimental results show the effectiveness of the proposed techniques. The assessment of these approaches confirms that the lazy learning paradigm can be compatible with traditional methods and appropriate for a large number of applications.
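The voting idea described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's implementation: the per-instance relevance score (entropy reduction among training instances sharing the test instance's attribute value), the k-NN base classifier, and all function names are illustrative assumptions standing in for the lazy ranking and classifier actually used in the article.

```python
from collections import Counter
import math


def entropy(labels):
    """Shannon entropy of a class-label list."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def lazy_relevance(train_X, train_y, attr, value):
    """Per-instance relevance of `attr`: entropy reduction among training
    instances that share the test instance's value for that attribute.
    (A simplified, assumed stand-in for the paper's lazy ranking.)"""
    subset = [y for x, y in zip(train_X, train_y) if x[attr] == value]
    if not subset:
        return 0.0
    return entropy(train_y) - entropy(subset)


def knn_predict(train_X, train_y, test_x, attrs, k=3):
    """k-NN with overlap (Hamming) distance restricted to `attrs`."""
    dists = sorted(
        (sum(x[a] != test_x[a] for a in attrs), y)
        for x, y in zip(train_X, train_y)
    )
    return Counter(y for _, y in dists[:k]).most_common(1)[0][0]


def lazy_voting_classify(train_X, train_y, test_x, candidate_ns, k=3):
    """Rank attributes lazily for this test instance, classify once per
    candidate number of attributes, and return the majority vote."""
    ranked = sorted(
        range(len(test_x)),
        key=lambda a: lazy_relevance(train_X, train_y, a, test_x[a]),
        reverse=True,
    )
    votes = [
        knn_predict(train_X, train_y, test_x, ranked[:n], k)
        for n in candidate_ns
    ]
    return Counter(votes).most_common(1)[0][0]
```

For example, on a toy dataset where only the first attribute is predictive, `lazy_voting_classify(train_X, train_y, test_x, [1, 2, 3])` ranks attributes per test instance, runs k-NN with the top 1, 2, and 3 attributes, and returns the majority class among the three predictions, so no single "correct" number of attributes has to be chosen up front.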
Similar Articles
Lazy attribute selection: Choosing attributes at classification time
Attribute selection is a data preprocessing step which aims at identifying relevant attributes for the target machine learning task – namely classification in this paper. In this paper, we propose a new attribute selection strategy – based on a lazy learning approach – which postpones the identification of relevant attributes until an instance is submitted for classification. Our strategy relie...
LBR-Meta: An Efficient Algorithm for Lazy Bayesian Rules
LBR is a highly accurate classification algorithm, which lazily constructs a single Bayesian rule for each test instance at classification time. However, its computational complexity of attribute-value pair selection is quadratic to the number of attributes. This fact incurs high computational costs, especially for datasets of high dimensionality. To solve the problem, this paper proposes an ef...
Use of Attribute Selection Criteria in Decision Trees in Uncertain Domains
There are many attribute selection criteria involved in the induction of decision trees. We present criteria derived from an impurity measure in a unified framework and we use the same background to describe ORT criterion [12]. We compare the theoretical backgrounds of C.M. criteria with those of the ORT criterion and we illustrate their differences. We set lazy decision trees with regards to t...
Eager, Lazy and Hybrid Algorithms for Multi-Criteria Associative Classification
Classification aims to map a data instance to its appropriate class (or label). In associative classification the mapping is done through an association rule with the consequent restricted to the class attribute. Eager associative classification algorithms build a single rule set during the training phase, and this rule set is used to classify all test instances. Lazy algorithms, however, do no...
Neighborhood classifiers
K nearest neighbor classifier (K-NN) is widely discussed and applied in pattern recognition and machine learning, however, as a similar lazy classifier using local information for recognizing a new test, neighborhood classifier, few literatures are reported on. In this paper, we introduce neighborhood rough set model as a uniform framework to understand and implement neighborhood classifiers. T...
Journal:
JIDM
Volume 2, Issue -
Pages -
Publication date: 2011